Simple but Effective Approaches to Improving Tree-to-tree Model

Authors

  • Feifei Zhai
  • Jiajun Zhang
  • Yu Zhou
  • Chengqing Zong
Abstract

Tree-to-tree translation models are widely studied in statistical machine translation (SMT) and are believed to have great potential for achieving promising translation quality. However, existing models still perform unsatisfactorily because of limitations in both rule extraction and decoding. Our analysis and experiments show that the tree-to-tree model is severely hampered by several rigid syntactic constraints: the both-side subtree constraint in rule extraction, and the node constraint and the exact matching constraint in decoding. In this paper we propose two simple but effective approaches to overcome these constraints: integrating bilingual phrases through fuzzy matching and category translating, and binarizing the bilingual parse trees with head-out binarization. Our experiments show that the proposed approaches can significantly improve the performance of the tree-to-tree system and outperform the state-of-the-art phrase-based system Moses.
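The abstract names head-out binarization without describing it. As a rough illustration only, the Python sketch below shows one common reading of the idea: start from the head child of a node and attach its siblings one at a time, working outward, so that every node ends up with at most two children. The `Node` class, the caller-supplied `head` index, the attachment order (right siblings first, then left) and the `*` suffix on virtual node labels are all assumptions made for this sketch; the paper's actual procedure may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical parse-tree node; `head` is the index of the head child."""
    label: str
    children: List["Node"] = field(default_factory=list)
    head: int = 0  # assumed to be supplied by some head-finding rule

    def __str__(self) -> str:
        if not self.children:
            return self.label
        inner = " ".join(str(child) for child in self.children)
        return f"({self.label} {inner})"


def head_out_binarize(node: Node) -> Node:
    """Return a copy of `node` in which every node has at most two children."""
    kids = [head_out_binarize(child) for child in node.children]
    if len(kids) <= 2:
        return Node(node.label, kids, node.head)

    virtual = node.label + "*"  # assumed naming for intermediate nodes
    left = kids[:node.head]
    right = kids[node.head + 1:]
    current = kids[node.head]
    remaining = len(left) + len(right)

    # Attach right siblings first, nearest to the head first (an assumption).
    for sibling in right:
        remaining -= 1
        label = node.label if remaining == 0 else virtual
        current = Node(label, [current, sibling], head=0)

    # Then attach left siblings, again nearest to the head first.
    for sibling in reversed(left):
        remaining -= 1
        label = node.label if remaining == 0 else virtual
        current = Node(label, [sibling, current], head=1)

    return current


if __name__ == "__main__":
    # Flat noun phrase whose head is the third child (NN).
    tree = Node("NP", [Node("DT"), Node("JJ"), Node("NN"), Node("PP")], head=2)
    print(head_out_binarize(tree))  # (NP DT (NP* JJ (NP* NN PP)))
```

The virtual `NP*` nodes give rule extraction additional constituent boundaries inside a flat phrase, which is the general motivation for binarizing parse trees in syntax-based translation.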

Publication date: 2011